Using Gaussian Measures for Efficient Constraint Based Clustering
نویسندگان
چکیده
In this paper we present a novel iterative multiphase clustering technique for efficiently clustering high dimensional data points. For this purpose we implement clustering feature (CF) tree on a real data set and a Gaussian density distribution constraint on the resultant CF tree. The post processing by the application of Gaussian density distribution function on the micro-clusters leads to refinement of the previously formed clusters thus improving their quality. This algorithm also succeeds in overcoming the inherent drawbacks of conventional hierarchical methods of clustering like inability to undo the change made to the dendogram of the data points. Moreover, the constraint measure applied in the algorithm makes this clustering technique suitable for need driven data analysis. We provide veracity of our claim by evaluating our algorithm with other similar clustering algorithms.
منابع مشابه
Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms
Due to the resource constraint and dynamic parameters, reducing energy consumption became the most important issues of wireless sensor networks topology design. All proposed hierarchy methods cluster a WSN in different cluster layers in one step of evolutionary algorithm usage with complicated parameters which may lead to reducing efficiency and performance. In fact, in WSNs topology, increasin...
متن کاملAn Efficient Predictive Model for Probability of Genetic Diseases Transmission Using a Combined Model
In this article, a new combined approach of a decision tree and clustering is presented to predict the transmission of genetic diseases. In this article, the performance of these algorithms is compared for more accurate prediction of disease transmission under the same condition and based on a series of measures like the positive predictive value, negative predictive value, accuracy, sensitivit...
متن کاملMLCA: A Multi-Level Clustering Algorithm for Routing in Wireless Sensor Networks
Energy constraint is the biggest challenge in wireless sensor networks because the power supply of each sensor node is a battery that is not rechargeable or replaceable due to the applications of these networks. One of the successful methods for saving energy in these networks is clustering. It has caused that cluster-based routing algorithms are successful routing algorithm for these networks....
متن کاملAn Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
متن کاملAn Efficient Bi-objective Genetic Algorithm for the Single Batch-Processing Machine Scheduling Problem with Sequence Dependent Family Setup Time and Non-identical Job Sizes
This paper considers the problem of minimizing make-span and maximum tardiness simultaneously for scheduling jobs under non-identical job sizes, dynamic job arrivals, incompatible job families,and sequence-dependentfamily setup time on the single batch- processor, where split size of jobs is allowed between batches. At first, a new Mixed Integer Linear Programming (MILP) model is proposed for t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1411.3302 شماره
صفحات -
تاریخ انتشار 2014